Abstract. An algorithm to decide the emptiness of a regular type
expression with set operators given a set of parameterised type
definitions is presented. The algorithm can also be used to decide the
equivalence of two regular type expressions and the inclusion of one
regular type expression in another. The algorithm strictly generalises
previous work in that tuple distributivity is not assumed and set
operators are permitted in type expressions.

1 Introduction

Types play an important role in programming languages. They
make programs easier to understand and help detect errors. Types
have been introduced into logic programming in the forms of type
checking and inference [26, 32], type analysis, or typed languages.
Recent logic programming systems allow the programmer to declare
types for predicates, and type errors are then detected either at
compile time or at run time. The reader is referred to [27] for more
details on types in logic programming.

A type is a possibly infinite set of ground terms with a finite 
representation. An integral part of any type system is its type lan- 
guage that specifies which sets of ground terms are types. To be 
useful, types should be closed under intersection, union and com- 
plement operations. The decision problems such as the emptiness of 
a type, inclusion of a type in another and equivalence of two types 
should be decidable. Regular term languages [14], called regular
types, satisfy these conditions and have been widely used as types.

Most type systems use tuple distributive regular types, which are
strictly less powerful than regular types. Tuple distributive regular
types are regular types closed under tuple distributive closure.
Intuitively, the tuple distributive closure of a set of terms is the
set of all terms constructed recursively by permuting each argument
position among all terms that have the same function symbol.
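
As a small illustration of the closure just described, the following
Python sketch computes the tuple distributive closure of a finite set of
ground terms. The tuple encoding of terms and the name td_closure are
assumptions made for this illustration only; they are not taken from the
paper.

```python
from itertools import product

# A ground term is encoded (hypothetically) as a tuple: (symbol, arg1, ..., argn).
def td_closure(terms):
    """Tuple distributive closure of a finite set of ground terms."""
    # Group the terms by principal function symbol and arity.
    groups = {}
    for t in terms:
        groups.setdefault((t[0], len(t) - 1), []).append(t)
    closed = set()
    for (f, n), ts in groups.items():
        # Close each argument position independently, then recombine freely.
        columns = [td_closure({t[i + 1] for t in ts}) for i in range(n)]
        closed |= {(f, *args) for args in product(*columns)}
    return closed

a, b, c, d = ("a",), ("b",), ("c",), ("d",)
print(sorted(td_closure({("f", a, b), ("f", c, d)})))
```

The set {f(a,b), f(c,d)} is a regular type, but its closure also contains
f(a,d) and f(c,b). This is why tuple distributive types cannot express
every regular type.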



This paper gives an algorithm to decide if a type expression de- 
notes an empty set of terms. The correctness of the algorithm is 
proved and its complexity is analysed. The algorithm works on
prescriptive types [28]. By prescriptive types, we mean that the
meaning of a type is determined by a given set of type definitions. We
allow parametric and overloading polymorphism in type definitions. 
Prescriptive types are useful both in compilers and other program 
manipulation tools such as debuggers because they are easy to under- 
stand for programmers. Type expressions may contain set operators 
with their usual interpretations. Thus, the algorithm can be used to 
decide the equivalence of two type expressions and the inclusion of 
one type expression in another. The introduction of set operators 
into type expressions allows concise and intuitive representation of 
regular types. 

Though using regular term languages as types allows us to make
use of theoretical results in the field of tree automata [14], algorithms
for testing the emptiness of tree automata cannot be applied directly 
as type definitions may be parameterised. For instance, in order to 
decide the emptiness of a type expression given a set of type defi- 
nitions, it would be necessary to construct a tree automaton from 
the type expression and the set of type definitions before an
algorithm for determining the emptiness of a tree automaton can be
used. When type definitions are parameterised, this would make it 
necessary to construct a different automaton each time the empti- 
ness of a type expression is tested. Thus, an algorithm that works 
directly with type definitions is desirable as it avoids this repeated 
construction of automata. 

Attempts have been made in the past to find algorithms for regular
types [32, 33, 31, 10, 9]. To our knowledge, Dart and Zobel's
work is the only one to present decision algorithms for emptiness 
and inclusion problems for prescriptive regular types without the tu- 
ple distributive restriction. Unfortunately, their decision algorithm 
for the inclusion problem is incorrect for regular types in general. 
See p3 for a counterexample. Moreover, the type language of Dart 
and Zobel is less expressive than that considered in this paper since 
it doesn't allow set operators and parameterised type definitions. 

Set constraint solving has also been used in type checking and
type inference. However, set constraint solving methods are intended
to infer descriptive types rather than to test the emptiness of
prescriptive types [28]. Therefore, they are useful in different
settings from the algorithm presented in this paper. Moreover,
algorithms proposed for set constraint solving are not applicable to
the emptiness problem considered in this paper as they do not take
type definitions into account.

The remainder of this paper is organised as follows. Section 2
describes our language of type expressions and type definitions.
Section 3 presents our algorithm for testing if a type expression
denotes an empty set of terms. Section 4 addresses the correctness of
the algorithm. Section 5 presents the complexity of the algorithm and
section 6 concludes the paper. Some lemmas are presented in the
appendix.



2 Type Language 

Let $\Sigma$ be a fixed ranked alphabet. Each symbol in $\Sigma$ is
called a function symbol and has a fixed arity. It is assumed that
$\Sigma$ contains at least one constant, that is, a function symbol of
arity 0. The arity of a symbol $f$ is denoted as $arity(f)$. $\Sigma$
may be considered as the set of function symbols in a program. Let
$T(\Sigma)$ be the set of all terms over $\Sigma$. $T(\Sigma)$ is the
set of all possible values that a program variable can take. We shall
use regular term languages over $\Sigma$ as types.

A type is represented by a ground term constructed from another
ranked alphabet $\Pi$ and $\{\sqcap, \sqcup, \sim, \mathbf{1}, \mathbf{0}\}$,
called type constructors. It is assumed that
$(\Pi \cup \{\sqcap, \sqcup, \sim, \mathbf{1}, \mathbf{0}\}) \cap \Sigma = \emptyset$.
Thus, a type expression is a term in
$T(\Pi \cup \{\sqcap, \sqcup, \sim, \mathbf{1}, \mathbf{0}\})$. The
denotations of type constructors in $\Pi$ are determined by type
definitions whilst $\sqcap$, $\sqcup$, $\sim$, $\mathbf{1}$ and
$\mathbf{0}$ have fixed denotations that will be given soon.

Several equivalent formalisms such as tree automata [14], regular
term grammars [14] and regular unary logic programs have been used to
define regular types. We define types by type rules. A type rule is a
production rule of the form $c(\zeta_1, \ldots, \zeta_m) \to \tau$
where $c \in \Pi$, $\zeta_1, \ldots, \zeta_m$ are different type
parameters and $\tau \in T(\Sigma \cup \Pi \cup \Xi_m)$ where
$\Xi_m = \{\zeta_1, \ldots, \zeta_m\}$. The restriction that every
type parameter in the righthand side of a type rule must occur in
the lefthand side of the type rule is often referred to as type
preserving [30] and has been used in all the type definition
formalisms. Note that overloading of function symbols is permitted as
a function symbol can appear in the righthand sides of many type
rules. We denote by $\Delta$ the set of all type rules and define
$\Xi \stackrel{def}{=} \bigcup_{c \in \Pi} \Xi_{arity(c)}$.
$(\Pi, \Sigma, \Delta)$ is a restricted form of context-free term
grammar.

Example 1. Let $\Sigma = \{0, s(), nil, cons(,)\}$ and
$\Pi = \{Nat, Even, List()\}$. $\Delta$ defines natural numbers, even
numbers, and lists where

$$\Delta = \left\{ \begin{array}{l}
Nat \to 0 \mid s(Nat), \\
Even \to 0 \mid s(s(Even)), \\
List(\zeta) \to nil \mid cons(\zeta, List(\zeta))
\end{array} \right\}$$

where, for instance, $Nat \to 0 \mid s(Nat)$ is an abbreviation of two
rules $Nat \to 0$ and $Nat \to s(Nat)$.

$\Delta$ is called simplified if $\tau$ in each production rule
$c(\zeta_1, \ldots, \zeta_m) \to \tau$ is of the form
$f(\tau_1, \ldots, \tau_n)$ such that each $\tau_j$, for
$1 \le j \le n$, is either in $\Xi_m$ or of the form
$d(\zeta'_1, \ldots, \zeta'_k)$ and
$\zeta'_1, \ldots, \zeta'_k \in \Xi_m$. We shall assume that $\Delta$
is simplified. There is no loss of generality in using a simplified
set of type rules since every set of type rules can be simplified by
introducing new type constructors and rewriting and adding type rules
in the spirit of [10].

Example 2. The following is the simplified version of the set of type
rules in example 1. $\Sigma = \{0, s(), nil, cons(,)\}$,
$\Pi = \{Nat, Even, Odd, List()\}$ and

$$\Delta = \left\{ \begin{array}{ll}
Nat \to 0 \mid s(Nat), & Even \to 0 \mid s(Odd), \\
Odd \to s(Even), & List(\zeta) \to nil \mid cons(\zeta, List(\zeta))
\end{array} \right\}$$



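A type valuation $\phi$ is a mapping from $\Xi$ to
$T(\Pi \cup \{\sqcap, \sqcup, \sim, \mathbf{1}, \mathbf{0}\})$.
The instance $\phi(R)$ of a production rule $R$ under $\phi$ is
obtained by replacing each occurrence of each type parameter $\zeta$
in $R$ with $\phi(\zeta)$. E.g.,
$List(Nat \sqcap ({\sim}Even)) \to cons(Nat \sqcap ({\sim}Even), List(Nat \sqcap ({\sim}Even)))$
is the instance of $List(\zeta) \to cons(\zeta, List(\zeta))$ under a
type valuation that maps $\zeta$ to $Nat \sqcap ({\sim}Even)$. Let

$$ground(\Delta) = \{\phi(R) \mid R \in \Delta \wedge \phi \in (\Xi \mapsto T(\Pi \cup \{\sqcap, \sqcup, \sim, \mathbf{1}, \mathbf{0}\}))\}
\cup \{\mathbf{1} \to f(\mathbf{1}, \ldots, \mathbf{1}) \mid f \in \Sigma\}$$

$ground(\Delta)$ is the set of all ground instances of grammar rules
in $\Delta$ plus rules of the form
$\mathbf{1} \to f(\mathbf{1}, \ldots, \mathbf{1})$ for every
$f \in \Sigma$.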
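
The instantiation of a rule under a type valuation can be sketched in
Python as below. The nested-tuple encoding, the use of plain strings for
type parameters, and the names instantiate and instance are assumptions
of this sketch rather than notation from the paper.

```python
# Rule sides are encoded (hypothetically) as nested tuples
# (symbol, arg1, ..., argn); a type parameter is a plain string such as "z".
def instantiate(t, phi):
    """Replace every type parameter in t by its image under the valuation phi."""
    if isinstance(t, str):
        return phi[t]
    return (t[0],) + tuple(instantiate(arg, phi) for arg in t[1:])

def instance(rule, phi):
    """Instance of a production rule (lhs, rhs) under the valuation phi."""
    lhs, rhs = rule
    return instantiate(lhs, phi), instantiate(rhs, phi)

# List(z) -> cons(z, List(z)), instantiated with z mapped to Nat ⊓ (~Even).
rule = (("List", "z"), ("cons", "z", ("List", "z")))
odd = ("and", ("Nat",), ("not", ("Even",)))
print(instance(rule, {"z": odd}))
```

The printed pair corresponds to the instance given in the text, with
the set operators written as the symbols "and" and "not".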

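Given a set $\Delta$ of type definitions, the type denoted by a type
expression is determined by the following meaning function.

$$\begin{array}{rcl}
[\mathbf{1}]_\Delta & \stackrel{def}{=} & T(\Sigma) \\
[\mathbf{0}]_\Delta & \stackrel{def}{=} & \emptyset \\
[E_1 \sqcap E_2]_\Delta & \stackrel{def}{=} & [E_1]_\Delta \cap [E_2]_\Delta \\
[E_1 \sqcup E_2]_\Delta & \stackrel{def}{=} & [E_1]_\Delta \cup [E_2]_\Delta \\
[{\sim}E]_\Delta & \stackrel{def}{=} & T(\Sigma) - [E]_\Delta \\
[\omega]_\Delta & \stackrel{def}{=} & \bigcup_{(\omega \to f(E_1, \ldots, E_n)) \in ground(\Delta)} \{f(t_1, \ldots, t_n) \mid \forall 1 \le i \le n.\ t_i \in [E_i]_\Delta\}
\end{array}$$

$[\cdot]_\Delta$ gives fixed denotations to $\sqcap$, $\sqcup$, $\sim$,
$\mathbf{1}$ and $\mathbf{0}$. $\sqcap$, $\sqcup$ and $\sim$ are
interpreted by $[\cdot]_\Delta$ as set intersection, set union and set
complement with respect to $T(\Sigma)$. $\mathbf{1}$ denotes
$T(\Sigma)$ and $\mathbf{0}$ the empty set.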

Example 3. Let $\Delta$ be that in example 2. We have

$$\begin{array}{rcl}
[Nat]_\Delta & = & \{0, s(0), s(s(0)), \ldots\} \\
[Even]_\Delta & = & \{0, s(s(0)), s(s(s(s(0)))), \ldots\} \\
[Nat \sqcap {\sim}Even]_\Delta & = & \{s(0), s(s(s(0))), s(s(s(s(s(0))))), \ldots\} \\
[List(Nat \sqcap {\sim}Even)]_\Delta & = & \{nil, cons(s(0), nil), cons(s(s(s(0))), nil), \ldots\}
\end{array}$$

Lemma 5 in the appendix states that every type expression
denotes a regular term language, that is, a regular type.

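We extend $[\cdot]_\Delta$ to sequences $\theta$ of type expressions
as follows.

$$\begin{array}{rcl}
[\epsilon]_\Delta & \stackrel{def}{=} & \{\epsilon\} \\
[\langle E \rangle \bullet \theta']_\Delta & \stackrel{def}{=} & [E]_\Delta \times [\theta']_\Delta
\end{array}$$

where $\epsilon$ is the empty sequence, $\bullet$ is the infix sequence
concatenation operator, $\langle E \rangle$ is the sequence consisting
of the type expression $E$ and $\times$ is the Cartesian product
operator. As a sequence of type expressions, $\epsilon$ can be thought
of as consisting of zero instances of $\mathbf{1}$. We use $\Lambda$
to denote the sequence consisting of zero instances of $\mathbf{0}$
and define $[\Lambda]_\Delta = \emptyset$.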

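We shall call a sequence of type expressions simply a sequence.
A sequence expression is an expression consisting of sequences of
the same length and $\sqcap$, $\sqcup$ and $\sim$. The length of the
sequences in a sequence expression $\theta$ is called the dimension of
$\theta$ and is denoted by $\|\theta\|$. Let $\theta$, $\theta_1$ and
$\theta_2$ be sequence expressions of the same length.

$$\begin{array}{rcl}
[\theta_1 \sqcap \theta_2]_\Delta & \stackrel{def}{=} & [\theta_1]_\Delta \cap [\theta_2]_\Delta \\
[\theta_1 \sqcup \theta_2]_\Delta & \stackrel{def}{=} & [\theta_1]_\Delta \cup [\theta_2]_\Delta \\
[{\sim}\theta]_\Delta & \stackrel{def}{=} & \underbrace{T(\Sigma) \times \cdots \times T(\Sigma)}_{\|\theta\| \text{ times}} - [\theta]_\Delta
\end{array}$$

A conjunctive sequence expression is a sequence expression of the
form $\gamma_1 \sqcap \cdots \sqcap \gamma_m$ where $\gamma_i$, for
$1 \le i \le m$, are sequences.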

3 Emptiness Algorithm 

This section presents an algorithm that decides if a type expression 
denotes the empty set with respect to a given set of type definitions. 
The algorithm can also be used to decide if (the denotation of) one 
type expression is included in (the denotation of) another because 
$E_1$ is included in $E_2$ iff $E_1 \sqcap {\sim}E_2$ is empty.
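
The reduction of inclusion and equivalence to emptiness can be spelled
out as follows. This is a routine consequence of the set-theoretic
reading of the operators, written here as a sketch with $empty(E)$
standing for $[E]_\Delta = \emptyset$; the symmetric-difference form in
the last line is an illustration and not a formula from the paper.

```latex
\begin{align*}
[E_1]_\Delta \subseteq [E_2]_\Delta
  &\iff [E_1]_\Delta \cap (T(\Sigma) - [E_2]_\Delta) = \emptyset
   \iff \mathit{empty}(E_1 \sqcap {\sim}E_2), \\
[E_1]_\Delta = [E_2]_\Delta
  &\iff \mathit{empty}(E_1 \sqcap {\sim}E_2) \wedge \mathit{empty}(E_2 \sqcap {\sim}E_1) \\
  &\iff \mathit{empty}((E_1 \sqcap {\sim}E_2) \sqcup (E_2 \sqcap {\sim}E_1)).
\end{align*}
```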

We first introduce some terminology and notations. A type atom
is a type expression of which the principal type constructor is not a
set operator. A type literal is either a type atom or the complement
of a type atom. A conjunctive type expression $C$ is of the form
$\sqcap_{i \in I}\, l_i$ with $l_i$ being a type literal. Let $\alpha$
be a type atom. $\mathcal{F}(\alpha)$ defined below is the set of the
principal function symbols of the terms in $[\alpha]_\Delta$.

$$\mathcal{F}(\alpha) \stackrel{def}{=} \{f \in \Sigma \mid \exists \zeta_1 \cdots \zeta_k.\ ((\alpha \to f(\zeta_1, \ldots, \zeta_k)) \in ground(\Delta))\}$$

Let $f \in \Sigma$. Define

$$\mathcal{A}^f_\alpha \stackrel{def}{=} \{\langle \alpha_1, \ldots, \alpha_k \rangle \mid (\alpha \to f(\alpha_1, \ldots, \alpha_k)) \in ground(\Delta)\}$$

We have $\mathcal{A}^f_{\mathbf{1}} = \{\langle \mathbf{1}, \ldots, \mathbf{1} \rangle\}$.
Both $\mathcal{F}(\alpha)$ and $\mathcal{A}^f_\alpha$ are finite even
though $ground(\Delta)$ is usually not finite.

The algorithm repeatedly reduces the emptiness problem of a type
expression to the emptiness problems of sequence expressions and
then reduces the emptiness problem of a sequence expression to the
emptiness problems of type expressions. Tabulation is used to break
any possible loop and to ensure termination. Let $O$ be a type
expression or a sequence expression. Define
$empty(O) \stackrel{def}{=} ([O]_\Delta = \emptyset)$.



3.1 Two Reduction Rules 



We shall first sketch the two reduction rules and then add tabulation 
to form an algorithm. Initially the algorithm is to decide the validity 
of a formula of the form 

$$empty(E) \qquad (1)$$

where $E$ is a type expression.

Reduction Rule One. The first reduction rule rewrites a formula of
the form (1) into a conjunction of formulae of the following form.

$$empty(\sigma) \qquad (2)$$

where $\sigma$ is a sequence expression in which $\sim$ is applied to
type expressions but not to any sequence expression.

It is obvious that a type expression has a unique (modulo
equivalence of denotation) disjunctive normal form. Let $DNF(E)$ be
the disjunctive normal form of $E$. $empty(E)$ can be written into
$\bigwedge_{C \in DNF(E)} empty(C)$. Each $C$ is a conjunctive type
expression. We assume that $C$ contains at least one positive type
literal. This doesn't cause any loss of generality as
$[\mathbf{1} \sqcap C]_\Delta = [C]_\Delta$ for any conjunctive type
expression $C$. We also assume that $C$ doesn't contain repeated
occurrences of the same type literal.

Let $C = \sqcap_{1 \le i \le m}\, \omega_i \sqcap \sqcap_{1 \le j \le n}\, {\sim}\tau_j$
where $\omega_i$ and $\tau_j$ are type atoms. The set of positive type
literals in $C$ is denoted as
$pos(C) \stackrel{def}{=} \{\omega_i \mid 1 \le i \le m\}$ while the
set of complemented type atoms is denoted as
$neg(C) \stackrel{def}{=} \{\tau_j \mid 1 \le j \le n\}$. $lit(C)$
denotes the set of literals occurring in $C$. By lemma 3 in the
appendix, $empty(C)$ is equivalent to

$$\bigwedge_{f \in \bigcap_{\alpha \in pos(C)} \mathcal{F}(\alpha)} empty((\sqcap_{\omega \in pos(C)} (\sqcup \mathcal{A}^f_\omega)) \sqcap (\sqcap_{\tau \in neg(C)} {\sim}(\sqcup \mathcal{A}^f_\tau))) \qquad (3)$$

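The intuition behind the equivalence is as follows. $[C]_\Delta$ is
empty iff, for every function symbol $f$, the set of the sequences
$\langle t_1, \ldots, t_k \rangle$ of terms such that
$f(t_1, \ldots, t_k) \in [C]_\Delta$ is empty. Only the function
symbols in $\bigcap_{\alpha \in pos(C)} \mathcal{F}(\alpha)$ need to be
considered.

We note the following two special cases of the formula (3).

(a) If $\bigcap_{\alpha \in pos(C)} \mathcal{F}(\alpha) = \emptyset$
    then the formula (3) is true because
    $\bigwedge \emptyset = true$. In particular,
    $\mathcal{F}(\mathbf{0}) = \emptyset$. Thus, if
    $\mathbf{0} \in pos(C)$ then
    $\bigcap_{\alpha \in pos(C)} \mathcal{F}(\alpha) = \emptyset$ and
    hence the formula (3) is true.

(b) If $\mathcal{A}^f_\tau = \emptyset$ for some $\tau \in neg(C)$ then
    $\sqcup \mathcal{A}^f_\tau = \langle \mathbf{0}, \ldots, \mathbf{0} \rangle$
    and ${\sim}(\sqcup \mathcal{A}^f_\tau) = \langle \mathbf{1}, \ldots, \mathbf{1} \rangle$.
    Thus, $\tau$ has no effect on the subformula for $f$ when
    $\mathcal{A}^f_\tau = \emptyset$.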

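In order to get rid of complement operators over sequence
subexpressions, the complement operator in
${\sim}(\sqcup \mathcal{A}^f_\tau)$ is pushed inwards by the function
$push$ defined in the following.

$$\begin{array}{rcl}
push({\sim}(\sqcup_{i \in I}\, \gamma_i)) & \stackrel{def}{=} & \sqcap_{i \in I}\, push({\sim}\gamma_i) \\
push({\sim}\langle E_1, E_2, \ldots, E_k \rangle) & \stackrel{def}{=} & \sqcup_{1 \le l \le k} \langle \underbrace{\mathbf{1}, \ldots, \mathbf{1}}_{l-1}, {\sim}E_l, \underbrace{\mathbf{1}, \ldots, \mathbf{1}}_{k-l} \rangle \quad \mbox{for } k \ge 1 \\
push({\sim}\epsilon) & \stackrel{def}{=} & \Lambda
\end{array}$$

It follows from De Morgan's law and the definition of
$[\cdot]_\Delta$ that
$[push({\sim}(\sqcup \mathcal{A}^f_\tau))]_\Delta = [{\sim}(\sqcup \mathcal{A}^f_\tau)]_\Delta$.
Substituting $push({\sim}(\sqcup \mathcal{A}^f_\tau))$ for
${\sim}(\sqcup \mathcal{A}^f_\tau)$ in the formula (3) gives rise to a
formula of the form (2).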
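
The effect of push can be sketched in Python as follows. Sequences are
encoded as tuples of type expressions, a union or intersection of
sequences as a list, and the names TOP, push_seq and push_union are
hypothetical; none of this encoding comes from the paper.

```python
TOP = ("atom", "1", ())   # hypothetical encoding of the type expression 1

def push_seq(seq):
    """push(~<E1,...,Ek>) as a list read as a union of k sequences.

    For the empty sequence the list is empty; an empty union denotes the
    empty set, which is the role played by Lambda in the text.
    """
    k = len(seq)
    return [(TOP,) * l + (("not", seq[l]),) + (TOP,) * (k - l - 1)
            for l in range(k)]

def push_union(seqs):
    """push(~(union of seqs)) as a list read as an intersection of unions."""
    return [push_seq(seq) for seq in seqs]

theta, zeta = ("atom", "theta", ()), ("atom", "zeta", ())
print(push_union([(theta, zeta)]))
```

For the single sequence with components theta and zeta, the result is
one union with two members: the first complements theta and leaves 1 in
the second position, the second does the opposite.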

Reduction Rule Two. The second reduction rule rewrites a formula
of the form (2) to a conjunction of disjunctions of formulae of the
form (1). Formula (2) is written into a conjunction of formulae of
the form

$$empty(\Gamma)$$

where $\Gamma$ is a conjunctive sequence expression.

In the case $\|\Gamma\| = 0$, by a lemma in the appendix,
$empty(\Gamma)$ can be decided without further reduction. If
$\Lambda \in \Gamma$ then $empty(\Gamma)$ is true because
$[\Lambda]_\Delta = \emptyset$. Otherwise, $empty(\Gamma)$ is false
because $[\Gamma]_\Delta = \{\epsilon\}$.

In the case $\|\Gamma\| \neq 0$, $empty(\Gamma)$ is equivalent to

$$\bigvee_{1 \le j \le \|\Gamma\|} empty(\Gamma{\downarrow}j)$$

where, letting $\Gamma = \gamma_1 \sqcap \cdots \sqcap \gamma_k$,
$\Gamma{\downarrow}j \stackrel{def}{=} \sqcap_{1 \le i \le k}\, \gamma_i^j$
with $\gamma_i^j$ being the $j$-th component of $\gamma_i$. Note that
$\Gamma{\downarrow}j$ is a type expression and
$empty(\Gamma{\downarrow}j)$ is of the form (1).
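
The second reduction rule, applied to one conjunctive sequence
expression, can be sketched as below. The list-of-tuples encoding, the
marker LAMBDA, and the names project and empty_conj are assumptions of
this sketch; is_empty stands for whatever emptiness test is available
for type expressions.

```python
from functools import reduce

LAMBDA = "Lambda"   # hypothetical marker for the sequence written Λ in the text

def project(gamma, j):
    """Intersect the j-th components (0-based) of all sequences in gamma."""
    return reduce(lambda x, y: ("and", x, y), [seq[j] for seq in gamma])

def empty_conj(gamma, is_empty):
    """Second reduction rule for a non-empty list gamma of sequences.

    is_empty is a caller-supplied test for emptiness of a type expression.
    """
    if LAMBDA in gamma:
        return True                      # the denotation of Λ is the empty set
    k = len(gamma[0])                    # the dimension of gamma
    if k == 0:
        return False                     # gamma denotes the set containing ε
    return any(is_empty(project(gamma, j)) for j in range(k))
```

A conjunction of sequences is empty as soon as one column is empty,
because the denotation of the conjunction is the Cartesian product of
the denotations of its columns.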

3.2 Algorithm 

The two reduction rules in the previous section form the core of the
algorithm. However, they alone cannot be used as an algorithm as
a formula $empty(E)$ may reduce to a formula containing $empty(E)$
as a sub-formula, leading to nontermination. Suppose
$\Sigma = \{f(), a\}$, $\Pi = \{Null\}$ and
$\Delta = \{Null \to f(Null)\}$. Clearly, $empty(Null)$ is true.
However, by the first reduction rule, $empty(Null)$ reduces to
$empty(\langle Null \rangle)$ which then reduces to $empty(Null)$ by
the second reduction rule. This process will not terminate.

The solution, inspired by Dart and Zobel [10], is to remember in a
table a particular kind of formulae whose truth is being tested. When
a formula of that kind is tested, the table is first looked up. If the
formula is implied by any formula in the table, then it is determined
as true. Otherwise, the formula is added into the table and then
reduced by a reduction rule.

The emptiness algorithm presented below remembers every conjunctive
type expression of which emptiness is being tested. Thus the table is
a set of conjunctive type expressions. Let $C_1$ and $C_2$ be
conjunctive type expressions. We define
$(C_1 \preceq C_2) \stackrel{def}{=} (lit(C_1) \supseteq lit(C_2))$.
Since $C_i = \sqcap_{l \in lit(C_i)}\, l$, $C_1 \preceq C_2$ implies
$[C_1]_\Delta \subseteq [C_2]_\Delta$ and hence
$(C_1 \preceq C_2) \wedge empty(C_2)$ implies $empty(C_1)$.

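Adding tabulation to the two reduction rules, we obtain the following
algorithm for testing the emptiness of prescriptive regular types. Let

$$B^f_C \stackrel{def}{=} (\sqcap_{\omega \in pos(C)} (\sqcup \mathcal{A}^f_\omega)) \sqcap (\sqcap_{\tau \in neg(C)}\, push({\sim}(\sqcup \mathcal{A}^f_\tau))).$$

$$\begin{array}{rcll}
\mathit{etype}(E) & \stackrel{def}{=} & \mathit{etype}(E, \emptyset) & (4) \\[1ex]
\mathit{etype}(E, \Psi) & \stackrel{def}{=} & \forall C \in DNF(E).\ \mathit{etype\_conj}(C, \Psi) & (5) \\[1ex]
\mathit{etype\_conj}(C, \Psi) & \stackrel{def}{=} &
\left\{ \begin{array}{ll}
true, & \mbox{if } pos(C) \cap neg(C) \neq \emptyset, \\
true, & \mbox{if } \exists C' \in \Psi.\ C \preceq C', \\
\forall f \in \bigcap_{\alpha \in pos(C)} \mathcal{F}(\alpha).\ \mathit{eseq}(B^f_C, \Psi \cup \{C\}), & \mbox{otherwise.}
\end{array} \right. & (6) \\[1ex]
\mathit{eseq}(\Theta, \Psi) & \stackrel{def}{=} & \forall \Gamma \in DNF(\Theta).\ \mathit{eseq\_conj}(\Gamma, \Psi) & (7) \\[1ex]
\mathit{eseq\_conj}(\Gamma, \Psi) & \stackrel{def}{=} &
\left\{ \begin{array}{ll}
true & \mbox{if } \|\Gamma\| = 0 \wedge \Lambda \in \Gamma, \\
false & \mbox{if } \|\Gamma\| = 0 \wedge \Lambda \notin \Gamma, \\
\exists 1 \le j \le \|\Gamma\|.\ \mathit{etype}(\Gamma{\downarrow}j, \Psi) & \mbox{if } \|\Gamma\| \neq 0.
\end{array} \right. & (8)
\end{array}$$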



Equation (4) initialises the table to the empty set. Equations (5)
and (6) implement the first reduction rule while equations (7) and (8)
implement the second reduction rule. $\mathit{etype}(,)$ and
$\mathit{etype\_conj}(,)$ test the emptiness of an arbitrary type
expression and that of a conjunctive type expression respectively.
$\mathit{eseq}(,)$ tests emptiness of a sequence expression consisting
of sequences and $\sqcap$ and $\sqcup$ operators while
$\mathit{eseq\_conj}(,)$ tests the emptiness of a conjunctive sequence
expression. The expression of which emptiness is to be tested is
passed as the first argument to these functions. The table is passed
as the second argument. It is used in $\mathit{etype\_conj}(,)$ to
detect a conjunctive type expression of which emptiness is implied by
the emptiness of a tabled conjunctive type expression. As we shall
show later, this ensures the termination of the algorithm. Each of the
four binary functions returns true iff the emptiness of the first
argument is implied by the second argument and the set of type
definitions.

Tabling any other kind of expressions such as arbitrary type ex- 
pressions can also ensure termination. However, tabling conjunctive 
type expressions makes it easier to detect the implication of the 
emptiness of one expression by that of another because $lit(C)$ can
be easily computed given a conjunctive type expression $C$. In an
implementation, a conjunctive type expression $C$ in the table can be
represented as $lit(C)$.

The first two definitions for $\mathit{etype\_conj}(C, \Psi)$ in
equation (6) terminate the algorithm when the emptiness of $C$ can be
decided by $C$ and $\Psi$ without using type definitions. The first
definition also excludes from the table any conjunctive type
expression that contains both a type atom and its complement.
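
The following Python sketch puts equations (4) to (8) together for
simplified type rules. It is an illustration under stated assumptions,
not the authors' implementation: the tuple encoding of type expressions,
the layout of rules and sigma, and the helper names alts and dnf are all
hypothetical. The sequence expression of equation (6) is never built
explicitly; its disjunctive normal form is enumerated directly.

```python
from functools import reduce
from itertools import product

# Hypothetical encoding: a type expression is ("atom", name, args), ("not", e),
# ("and", e1, e2) or ("or", e1, e2).  rules maps a type constructor to
# (parameters, [(f, rhs arguments)]) in simplified form, where a rhs argument
# is a parameter name or a pair (constructor, parameter names).  sigma maps
# each function symbol to its arity.
TOP, BOT = ("atom", "1", ()), ("atom", "0", ())

def atom(name, *args):
    return ("atom", name, tuple(args))

def alts(a, rules, sigma):
    """Right-hand sides (f, sequence) of the ground rules whose left side is a."""
    _, name, args = a
    if name == "1":
        return [(f, (TOP,) * n) for f, n in sigma.items()]
    if name == "0":
        return []
    params, productions = rules[name]
    env = dict(zip(params, args))
    inst = lambda r: env[r] if isinstance(r, str) else atom(r[0], *(env[p] for p in r[1]))
    return [(f, tuple(inst(r) for r in rhs)) for f, rhs in productions]

def dnf(e, positive=True):
    """Disjunctive normal form as a list of frozensets of (sign, atom) literals."""
    if e[0] == "atom":
        return [frozenset([(positive, e)])]
    if e[0] == "not":
        return dnf(e[1], not positive)
    left, right = dnf(e[1], positive), dnf(e[2], positive)
    if (e[0] == "or") == positive:               # acts as a union
        return left + right
    return [c1 | c2 for c1 in left for c2 in right]

def etype(e, rules, sigma, table=frozenset()):   # equations (4) and (5)
    return all(etype_conj(c, rules, sigma, table) for c in dnf(e))

def etype_conj(c, rules, sigma, table):          # equation (6)
    pos = [a for sign, a in c if sign] or [TOP]  # C and 1 ⊓ C denote the same set
    neg = [a for sign, a in c if not sign]
    c = frozenset([(True, a) for a in pos] + [(False, a) for a in neg])
    if set(pos) & set(neg):
        return True
    if any(c >= tabled for tabled in table):     # lit(C) ⊇ lit(C') for a tabled C'
        return True
    heads = set.intersection(*({f for f, _ in alts(a, rules, sigma)} for a in pos))
    return all(eseq(f, pos, neg, rules, sigma, table | {c}) for f in heads)

def eseq(f, pos, neg, rules, sigma, table):      # equations (7) and (8) on B^f_C
    k = sigma[f]
    pos_choices = [[s for g, s in alts(a, rules, sigma) if g == f] for a in pos]
    neg_seqs = [s for a in neg for g, s in alts(a, rules, sigma) if g == f]
    for picked in product(*pos_choices):
        # Choosing one disjunct of each push(~gamma) amounts to choosing the
        # position that gets complemented.  With k == 0 and a complemented
        # sequence there is no position to choose: that is the Λ case (true).
        for where in product(range(k), repeat=len(neg_seqs)):
            columns = []
            for j in range(k):
                parts = [s[j] for s in picked]
                parts += [("not", s[j]) for s, l in zip(neg_seqs, where) if l == j]
                columns.append(reduce(lambda x, y: ("and", x, y), parts))
            if not any(etype(col, rules, sigma, table) for col in columns):
                return False
    return True

# Type definitions of Example 2.
RULES = {
    "Nat":  ((), [("0", ()), ("s", (("Nat", ()),))]),
    "Even": ((), [("0", ()), ("s", (("Odd", ()),))]),
    "Odd":  ((), [("s", (("Even", ()),))]),
    "List": (("z",), [("nil", ()), ("cons", ("z", ("List", ("z",))))]),
}
SIGMA = {"0": 0, "s": 1, "nil": 0, "cons": 2}
nat, even, odd = atom("Nat"), atom("Even"), atom("Odd")
print(etype(("and", ("and", nat, ("not", even)), ("not", odd)), RULES, SIGMA))
print(etype(atom("List", ("and", even, ("not", nat))), RULES, SIGMA))
```

The table is an immutable set that is extended on the way down, so each
branch of the evaluation sees only the conjunctive type expressions
tabled by its ancestors, as in equation (6). The two calls at the end
test whether Nat with Even and Odd removed is empty, and whether the
type of lists over Even without Nat is empty.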

3.3 Examples 

We now illustrate the algorithm with some examples. 

Example 4. Let type definitions be given as in example 2. The tree
in figure 1 depicts the evaluation of
$\mathit{etype}(Nat \sqcap {\sim}Even \sqcap {\sim}Odd)$ by the
algorithm. Nodes are labeled with function calls. We will identify
a node with its label. Arcs from a node to its children are labeled
with the number of the equation that is used to evaluate the node.
Abbreviations used in the labels are defined in the legend to the
right of the tree. Though $[A]_\Delta = [B]_\Delta$, $A$ and $B$ are
syntactically different type expressions. The evaluation returns true,
verifying $[Nat \sqcap {\sim}Even \sqcap {\sim}Odd]_\Delta = \emptyset$.
Consider $\mathit{etype\_conj}(B, \{A\})$. We have $B \preceq A$ as
$lit(A) = lit(B)$. Thus, by equation (6),
$\mathit{etype\_conj}(B, \{A\}) = true$.



Fig. 1. Evaluation of $\mathit{etype}(Nat \sqcap {\sim}Even \sqcap {\sim}Odd)$.
Legend: $A = Nat \sqcap {\sim}Even \sqcap {\sim}Odd$,
$B = Nat \sqcap {\sim}Odd \sqcap {\sim}Even$,
$C = \langle Nat \rangle \sqcap \langle {\sim}Odd \rangle \sqcap \langle {\sim}Even \rangle$.



Example 5. Let type definitions be given as in example 2. The tree
in figure 2 depicts the evaluation of
$\mathit{etype}(List(Even \sqcap {\sim}Nat))$ by the algorithm. The
evaluation returns false, verifying
$[List(Even \sqcap {\sim}Nat)]_\Delta \neq \emptyset$. Indeed,
$[List(Even \sqcap {\sim}Nat)]_\Delta = \{nil\}$. The rightmost node
is not evaluated as its sibling returns false, which is enough to
establish the falsity of their parent node.



Fig. 2. Evaluation of $\mathit{etype}(List(Even \sqcap {\sim}Nat))$.
Legend: $A = List(Even \sqcap {\sim}Nat)$, $B = Even \sqcap {\sim}Nat$.



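Example 6. The following is a simplified version of the type
definitions that is used in [24] to show the incorrectness of the
algorithm by Dart and Zobel for testing inclusion of one regular type
in another [10].

Let $\Pi = \{\alpha, \beta, \theta, \sigma, \omega, \zeta, \eta\}$,
$\Sigma = \{a, b, g(), h(,)\}$ and

$$\Delta = \left\{ \begin{array}{l}
\alpha \to g(\omega), \quad \beta \to g(\theta) \mid g(\sigma), \quad
\theta \to a \mid h(\theta, \zeta), \quad \sigma \to b \mid h(\sigma, \eta), \\
\omega \to a \mid b \mid h(\omega, \zeta) \mid h(\omega, \eta), \quad
\zeta \to a, \quad \eta \to b
\end{array} \right\}$$

Let $t = g(h(h(a,b),a))$. $t \in [\alpha]_\Delta$ and
$t \notin [\beta]_\Delta$, see example 3 in [24] for more details. So,
$[\alpha]_\Delta \not\subseteq [\beta]_\Delta$. This is verified by our
algorithm as follows. Let $\Psi_1 = \{\alpha \sqcap {\sim}\beta\}$ and
$\Psi_2 = \Psi_1 \cup \{\omega \sqcap {\sim}\theta \sqcap {\sim}\sigma\}$.
By applying equations (4), (5), (6), (7), (8) and (5) in that order, we
have
$\mathit{etype}(\alpha \sqcap {\sim}\beta) = \mathit{etype\_conj}(\omega \sqcap {\sim}\theta \sqcap {\sim}\sigma, \Psi_1)$.
By equation (6), we have

$$\mathit{etype}(\alpha \sqcap {\sim}\beta) = \mathit{eseq}(\epsilon \sqcap \Lambda \sqcap \epsilon, \Psi_2) \wedge \mathit{eseq}(\epsilon \sqcap \epsilon \sqcap \Lambda, \Psi_2) \wedge \mathit{eseq}(\Theta, \Psi_2)$$

where
$\Theta = (\langle \omega, \zeta \rangle \sqcup \langle \omega, \eta \rangle) \sqcap (\langle {\sim}\theta, \mathbf{1} \rangle \sqcup \langle \mathbf{1}, {\sim}\zeta \rangle) \sqcap (\langle {\sim}\sigma, \mathbf{1} \rangle \sqcup \langle \mathbf{1}, {\sim}\eta \rangle)$.
We choose not to simplify expressions such as
$\epsilon \sqcap \epsilon \sqcap \Lambda$ so as to make the example
easy to follow. By applying equations (7) and (8), we have both
$\mathit{eseq}(\epsilon \sqcap \Lambda \sqcap \epsilon, \Psi_2) = true$
and
$\mathit{eseq}(\epsilon \sqcap \epsilon \sqcap \Lambda, \Psi_2) = true$.
So, $\mathit{etype}(\alpha \sqcap {\sim}\beta) = \mathit{eseq}(\Theta, \Psi_2)$.
Let
$\Gamma = \langle \omega, \zeta \rangle \sqcap \langle {\sim}\theta, \mathbf{1} \rangle \sqcap \langle \mathbf{1}, {\sim}\eta \rangle$.
To show $\mathit{etype}(\alpha \sqcap {\sim}\beta) = false$, it
suffices to show $\mathit{eseq\_conj}(\Gamma, \Psi_2) = false$ by
equation (7) because $\Gamma \in DNF(\Theta)$ and
$\mathit{etype}(\alpha \sqcap {\sim}\beta) = \mathit{eseq}(\Theta, \Psi_2)$.

Figure 3 depicts the evaluation of
$\mathit{eseq\_conj}(\Gamma, \Psi_2)$. The node that is linked to its
parent by a dashed line is not evaluated because one of its siblings
returns false, which is sufficient to establish the falsity of its
parent. It is clear from the figure that
$\mathit{eseq\_conj}(\Gamma, \Psi_2) = false$ and hence
$\mathit{etype}(\alpha \sqcap {\sim}\beta) = false$.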



4 Correctness 

This section addresses the correctness of the algorithm. We shall 
first show that tabulation ensures the termination of the algorithm 
because the table can only be of finite size. We then establish the 
partial correctness of the algorithm. 



Fig. 3. Evaluation of $\mathit{eseq\_conj}(\Gamma, \Psi_2)$.
Legend:
$\Theta_1 = (\langle \omega, \zeta \rangle \sqcup \langle \omega, \eta \rangle) \sqcap (\langle {\sim}\theta, \mathbf{1} \rangle \sqcup \langle \mathbf{1}, {\sim}\zeta \rangle)$,
$\Psi_3 = \Psi_2 \cup \{\omega \sqcap {\sim}\theta\}$,
$\Psi_4 = \Psi_2 \cup \{\zeta \sqcap {\sim}\eta\}$,
$\Gamma = \langle \omega, \zeta \rangle \sqcap \langle {\sim}\theta, \mathbf{1} \rangle \sqcap \langle \mathbf{1}, {\sim}\eta \rangle$.



4.1 Termination 

Given a type expression $E$, a top-level type atom in $E$ is a type
atom in $E$ that is not a sub-term of any other type atom in $E$. The
set of top-level type atoms in $E$ is denoted by $TLA(E)$. For
instance, letting
$E = {\sim}List(Nat) \sqcup Tree(Nat \sqcap {\sim}Even)$,
$TLA(E) = \{List(Nat), Tree(Nat \sqcap {\sim}Even)\}$. We extend
$TLA(\cdot)$ to sequences by
$TLA(\langle E_1, E_2, \ldots, E_k \rangle) \stackrel{def}{=} \bigcup_{1 \le i \le k} TLA(E_i)$.

Given a type expression $E_0$, the evaluation tree for
$\mathit{etype}(E_0)$ contains nodes of the form
$\mathit{etype}(E, \Psi)$, $\mathit{etype\_conj}(C, \Psi)$,
$\mathit{eseq}(\Theta, \Psi)$ and $\mathit{eseq\_conj}(\Gamma, \Psi)$
in addition to the root that is $\mathit{etype}(E_0)$. Only nodes of
the form $\mathit{etype\_conj}(C, \Psi)$ add conjunctive type
expressions to the table. Other forms of nodes only pass the table
around. Therefore, it suffices to show that the type atoms occurring
in the first argument of the nodes are from a finite set because any
conjunctive type expression added into the table is the first argument
of a node of the form $\mathit{etype\_conj}(C, \Psi)$.

The set $RTA(E_0)$ of type atoms relevant to a type expression $E_0$
is the smallest set of type atoms satisfying

- $TLA(E_0) \subseteq RTA(E_0)$, and
- if $\tau$ is in $RTA(E_0)$ and
  $\tau \to f(\tau_1, \tau_2, \ldots, \tau_k)$ is in $ground(\Delta)$
  then $TLA(\tau_i) \subseteq RTA(E_0)$ for $1 \le i \le k$.

The height of $\tau_i$ is no more than that of $\tau$ for any
$\tau \to f(\tau_1, \tau_2, \ldots, \tau_k)$ in $ground(\Delta)$. Thus,
the height of any type atom in $RTA(E_0)$ is bounded. There are only a
finite number of type constructors in $\Pi$. Thus, $RTA(E_0)$ is of
finite size. It follows by examining the algorithm that type atoms in
the first argument of the nodes in the evaluation tree for
$\mathit{etype}(E_0)$ are from $RTA(E_0)$ which is finite. Therefore,
the algorithm terminates.

4.2 Partial Correctness 

The partial correctness of the algorithm is established by showing
$\mathit{etype}(E_0) = true$ iff $empty(E_0)$. Let $\Psi$ be a set of
conjunctive type expressions. Define
$\rho_\Psi \stackrel{def}{=} \bigwedge_{C \in \Psi} empty(C)$. The
following two lemmas form the core of our proof of the partial
correctness of the algorithm.

Lemma 1. Let $\Psi$ be a set of conjunctive type expressions, $E$ a
type expression, $C$ a conjunctive type expression, $\Theta$ a sequence
expression and $\Gamma$ a conjunctive sequence expression.

(a) If $\rho_\Psi \models empty(C)$ then $\mathit{etype\_conj}(C, \Psi) = true$, and

(b) If $\rho_\Psi \models empty(E)$ then $\mathit{etype}(E, \Psi) = true$, and

(c) If $\rho_\Psi \models empty(\Gamma)$ then $\mathit{eseq\_conj}(\Gamma, \Psi) = true$, and

(d) If $\rho_\Psi \models empty(\Theta)$ then $\mathit{eseq}(\Theta, \Psi) = true$.

Proof. The proof is done by induction on the size of the complement
of $\Psi$ with respect to the set of all possible conjunctive type
expressions in which type atoms are from $RTA(E_0)$ where $E_0$ is a
type expression.

Basis. The complement is empty. $\Psi$ contains all possible
conjunctive type expressions in which type atoms are from $RTA(E_0)$.
We have $C \in \Psi$ and hence $\mathit{etype\_conj}(C, \Psi) = true$
by equation (6). Therefore, (a) holds. (b) follows from (a) and
equation (5). (c) follows from (b), equation (8) and a lemma in the
appendix, and (d) follows from (c) and equation (7).

Induction. By lemma 3 in the appendix, $\rho_\Psi \models empty(C)$
implies $\rho_\Psi \models empty(B^f_C)$ for any
$f \in \bigcap_{\alpha \in pos(C)} \mathcal{F}(\alpha)$. Thus,
$\rho_{\Psi \cup \{C\}} \models empty(B^f_C)$. The complement of
$\Psi \cup \{C\}$ is smaller than the complement of $\Psi$. By the
induction hypothesis, we have
$\mathit{eseq}(B^f_C, \Psi \cup \{C\}) = true$. By equation (6),
$\mathit{etype\_conj}(C, \Psi) = true$. Therefore, (a) holds. (b)
follows from (a) and equation (5). (c) follows from (b), equation (8)
and a lemma in the appendix, and (d) follows from (c) and
equation (7). This completes the proof of the lemma.



Lemma 1 establishes the completeness of $\mathit{etype}(,)$,
$\mathit{etype\_conj}(,)$, $\mathit{eseq}(,)$ and
$\mathit{eseq\_conj}(,)$ while the following lemma establishes their
soundness.

Lemma 2. Let $\Psi$ be a set of conjunctive type expressions, $E$ a
type expression, $C$ a conjunctive type expression, $\Theta$ a sequence
expression and $\Gamma$ a conjunctive sequence expression.

(a) $\rho_\Psi \models empty(C)$ if $\mathit{etype\_conj}(C, \Psi) = true$, and

(b) $\rho_\Psi \models empty(E)$ if $\mathit{etype}(E, \Psi) = true$, and

(c) $\rho_\Psi \models empty(\Gamma)$ if $\mathit{eseq\_conj}(\Gamma, \Psi) = true$, and

(d) $\rho_\Psi \models empty(\Theta)$ if $\mathit{eseq}(\Theta, \Psi) = true$.

Proof. It suffices to prove (a) since (b), (c) and (d) follow from (a)
as in lemma 1. The proof is done by induction on $dp(C, \Psi)$, the
depth of the evaluation tree for $\mathit{etype\_conj}(C, \Psi)$.

Basis. $dp(C, \Psi) = 1$. $\mathit{etype\_conj}(C, \Psi) = true$
implies either (i) $pos(C) \cap neg(C) \neq \emptyset$ or (ii)
$\exists C' \in \Psi.\ C \preceq C'$. In case (i), $empty(C)$ is true
and $\rho_\Psi \models empty(C)$. Consider case (ii). By the definition
of $\preceq$ and $\rho_\Psi$, we have
$\mathit{etype\_conj}(C, \Psi) = true$ implies
$\rho_\Psi \models empty(C)$.

Induction. $dp(C, \Psi) > 1$. Assume
$\mathit{etype\_conj}(C, \Psi) = true$ and
$\rho_\Psi \models \neg empty(C)$. By lemma 3, there is
$f \in \bigcap_{\alpha \in pos(C)} \mathcal{F}(\alpha)$ such that
$\rho_\Psi \models \neg empty(B^f_C)$. We have
$\rho_{\Psi \cup \{C\}} \models \neg empty(B^f_C)$.
$dp(B^f_C, \Psi \cup \{C\}) < dp(C, \Psi)$. By the induction
hypothesis, we have $\mathit{eseq}(B^f_C, \Psi \cup \{C\}) = false$ for
otherwise, $\rho_{\Psi \cup \{C\}} \models empty(B^f_C)$. By
equation (6), $\mathit{etype\_conj}(C, \Psi) = false$ which contradicts
$\mathit{etype\_conj}(C, \Psi) = true$. So,
$\rho_\Psi \models empty(C)$ if $\mathit{etype\_conj}(C, \Psi) = true$.
This completes the induction and the proof of the lemma.

The following theorem is a corollary of lemmas 1 and 2.

Theorem 1. For any type expression $E$, $\mathit{etype}(E) = true$ iff
$empty(E)$.

Proof. By equation (4), $\mathit{etype}(E) = \mathit{etype}(E, \emptyset)$.
By lemma 1.(b) and lemma 2.(b), we have
$\mathit{etype}(E, \emptyset) = true$ iff
$\rho_\emptyset \models empty(E)$. The result follows since
$\rho_\emptyset = true$.



5 Complexity 



We now address the issue of complexity of the algorithm. We only
consider the worst-case time complexity of the algorithm. The time
spent on evaluating $\mathit{etype}(E_0)$ for a given type expression
$E_0$ can be measured in terms of the number of nodes in the evaluation
tree for $\mathit{etype}(E_0)$.

The algorithm cycles through $\mathit{etype}(,)$,
$\mathit{etype\_conj}(,)$, $\mathit{eseq}(,)$ and
$\mathit{eseq\_conj}(,)$. Thus, children of a node of the form
$\mathit{etype}(E, \Psi)$ can only be of the form
$\mathit{etype\_conj}(C, \Psi)$ and so on.

Let $|S|$ be the number of elements in a given set $S$. The largest
possible table in the evaluation of $\mathit{etype}(E_0)$ contains all
the conjunctive type expressions of which type atoms are from
$RTA(E_0)$. Therefore, the table can contain at most
$2^{|RTA(E_0)|}$ conjunctive type expressions. So, the height of the
tree is bounded by $O(2^{|RTA(E_0)|})$.

We now show that the branching factor of the tree is also bounded
by $O(2^{|RTA(E_0)|})$. By equation (5), the number of children of
$\mathit{etype}(E, \Psi)$ is bounded by two to the power of the number
of type atoms in $E$ which is bounded by $|RTA(E_0)|$ because $E$ can
only contain type atoms from $RTA(E_0)$. By equation (6), the number
of children of $\mathit{etype\_conj}(C, \Psi)$ is bounded by
$|\Sigma|$. The largest number of children of a node
$\mathit{eseq}(\Theta, \Psi)$ is bounded by two to the power of the
number of sequences in $\Theta$ where $\Theta = B^f_C$. For each
$\tau \in neg(C)$, $|push({\sim}(\sqcup \mathcal{A}^f_\tau))|$ is
$O(arity(f))$ and $|C| \le |RTA(E_0)|$. Thus, the number of sequences
in $\Theta$ is $O(arity(f) * |RTA(E_0)|)$ and hence the number of
children of $\mathit{eseq}(\Theta, \Psi)$ is $O(2^{|RTA(E_0)|})$ since
$arity(f)$ is a constant. By equation (8), the number of children of
$\mathit{eseq\_conj}(\Gamma, \Psi)$ is bounded by
$\max_{f \in \Sigma} arity(f)$. Therefore, the branching factor of
the tree is bounded by $O(2^{|RTA(E_0)|})$.

The above discussion leads to the following conclusion. 

Proposition 1. The time complexity of the algorithm is
$O(2^{|RTA(E_0)|})$.

The fact that the algorithm is exponential in time is expected
because the complexity coincides with the complexity of deciding the
emptiness of any tree automaton constructed from the type expression
and the type definitions. A deterministic frontier-to-root tree
automaton recognising $[E_0]_\Delta$ will consist of $2^{|RTA(E_0)|}$
states as observed in the proof of lemma 5. It is well-known that the
decision of the emptiness of the language of a deterministic
frontier-to-root tree automaton takes time polynomial in the number of
the states of the tree automaton. Therefore, the worst-case complexity
of the algorithm is the best we can expect from an algorithm for
deciding the emptiness of regular types that contain set operators.



6 Conclusion 

We have presented an algorithm for deciding the emptiness of pre- 
scriptive regular types. Type expressions are constructed from type 
constructors and set operators. Type definitions prescribe the mean- 
ing of type expressions. 

The algorithm uses tabulation to ensure termination. Though the
tabulation is inspired by Dart and Zobel [10], the decision problem
we consider in this paper is more complex as type expressions may
contain set operators. For that reason, the algorithm can also be
used for inclusion and equivalence problems of regular types. The
way we use tabulation leads to a correct algorithm for regular types
while the Dart-Zobel algorithm has been proved incorrect for regular
types in general. To the best of our knowledge, our algorithm is
the only correct algorithm for prescriptive regular types.

In addition to correctness, our algorithm generalises the work of
Dart and Zobel [10] in that type expressions can contain set
operators and type definitions can be parameterised. Parameterised
type definitions are more natural than monomorphic type definitions
[12], [26], [32], while set operators make type expressions concise.
The combination of these two features allows more natural type
declarations. For instance, the type of the logic program append can be
declared or inferred as append(List($\alpha$), List($\beta$), List($\alpha \cup \beta$)).

The algorithm is exponential in time. This coincides with decid- 
ing the emptiness of the language recognised by a tree automaton 
constructed from the type expression and the type definitions. How- 
ever, the algorithm avoids the construction of the tree automaton 
which cannot be constructed a priori when type definitions are pa- 
rameterised. 

Another related field is set constraint solving [13], [20], [18], [11].
However, set constraint solving methods are intended to infer descriptive
types rather than to test the emptiness of a prescriptive type [28].
Therefore, they are useful in different settings from the algorithm
presented in this paper. In addition, algorithms proposed for
solving set constraints [3], [4], [2], [1] are not applicable to the
emptiness problem considered in this paper. Take for example the
constructor rule used by these algorithms, which states that the
emptiness of $f(E_1, E_2, \ldots, E_m)$ is equivalent to the emptiness of
$E_i$ for some $1 \le i \le m$. However, empty(List(0)) is not equivalent
to empty(0). The latter is true while the former is false since
$[\mathrm{List}(0)]_\Delta = \{nil\}$. The constructor rule does not apply
because it deals with function symbols only and does not take the type
definitions into account.

References 

1. A. Aiken, D. Kozen, M. Vardi, and E. Wimmers. The complexity of set constraints. 
In Proceedings of 1993 Computer Science Logic Conference, pages 1-17, 1992. 

2. A. Aiken and T.K. Lakshman. Directional type checking of logic programs. In
B. Le Charlier, editor, Proceedings of the First International Static Analysis
Symposium, pages 43-60. Springer-Verlag, 1994.

3. A. Aiken and E. Wimmers. Solving systems of set constraints. In Proceedings of 
the Seventh IEEE Symposium on Logic in Computer Science, pages 329-340. The 
IEEE Computer Society Press, 1992. 

4. A. Aiken and E. Wimmers. Type inclusion constraints and type inference. In 
Proceedings of the 1993 Conference on Functional Programming Languages and 
Computer Architecture, pages 31-41, Copenhagen, Denmark, June 1993. 

5. C. Beierle. Type inferencing for polymorphic order-sorted logic programs. In 
L. Sterling, editor, Proceedings of the Twelfth International Conference on Logic 
Programming, pages 765-779. The MIT Press, 1995. 

6. L. Cardelli and P. Wegner. On understanding types, data abstraction, and poly- 
morphism. ACM computing surveys, 17(4):471-522, 1985. 

7. M. Codish and V. Lagoon. Type dependencies for logic programs using aci- 
unification. In Proceedings of the 1996 Israeli Symposium on Theory of Computing 
and Systems, pages 136-145. IEEE Press, June 1996. 

8. H. Comon, M. Dauchet, R. Gilleron, D. Lugiez, S. Tison, and M. Tommasi. Tree 
Automata Techniques and Applications. Draft, 1998. 

9. P.W. Dart and J. Zobel. Efficient run-time type checking of typed logic programs. 
Journal of Logic Programming, 14(1-2):31-69, 1992.

10. P.W. Dart and J. Zobel. A regular type language for logic programs. In Frank 
Pfenning, editor, Types in Logic Programming, pages 157-189. The MIT Press, 
1992. 

11. P. Devienne, J-M. Talbot, and S. Tison. Co-definite set constraints with member- 
ship expressions. In J. Jaffar, editor, Proceedings of the 1998 Joint Conference and 
Symposium on Logic Programming, pages 25-39. The MIT Press, 1998. 

12. T. Fruhwirth, E. Shapiro, M.Y. Vardi, and E. Yardeni. Logic programs as types 
for logic programs. In Proceedings of Sixth Annual IEEE Symposium on Logic in 
Computer Science, pages 300-309. The IEEE Computer Society Press, 1991. 

13. J. P. Gallagher and D.A. de Waal. Fast and precise regular approximations of logic 
programs. In M. Bruynooghe, editor, Proceedings of the Eleventh International 
Conference on Logic Programming, pages 599-613. The MIT Press, 1994. 



14. F. Gecseg and M. Steinby. Tree Automata. Akademiai Kiado, 1984. 

15. F. Gecseg and M. Steinby. Tree languages. In G. Rozenberg and A. Salomaa,
editors, Handbook of Formal Languages, pages 1-68. Springer-Verlag, 1996.

16. M. Hanus. Horn clause programs with polymorphic types: semantics and resolution. 
Theoretical Computer Science, 89(1):63-106, 1991. 

17. N. Heintze and J. Jaffar. A finite presentation theorem for approximating logic 
programs. In Proceedings of the Seventeenth Annual ACM Symposium on Principles of
Programming Languages, pages 197-209. The ACM Press, 1990. 

18. N. Heintze and J. Jaffar. A decision procedure for a class of set constraints. Tech- 
nical Report CMU-CS-91-110, Carnegie-Mellon University, February 1991. (Later 
version of a paper in Proc. 5th IEEE Symposium on LICS). 

19. N. Heintze and J. Jaffar. Semantic types for logic programs. In Frank Pfenning, 
editor, Types in Logic Programming, pages 141-155. The MIT Press, 1992. 

20. N. Heintze and J. Jaffar. Set constraints and set-based analysis. In Alan Borning, 
editor, Principles and Practice of Constraint Programming, volume 874 of Lecture 
Notes in Computer Science. Springer, May 1994. (PPCP'94: Second International 
Workshop, Orcas Island, Seattle, USA). 

21. D. Jacobs. Type declarations as subtype constraints in logic programming. SIG- 
PLAN Notices, 25(6):165-73, 1990. 

22. L. Lu. Type analysis of logic programs in the presence of type definitions. In 
Proceedings of the 1995 ACM SIGPLAN Symposium on Partial Evaluation and 
Semantics-Based Program Manipulation, pages 241-252. The ACM Press, 1995.

23. L. Lu. A polymorphic type analysis in logic programs by abstract interpretation. 
Journal of Logic Programming, 36(1):1-54, 1998.

24. L. Lu and J. Cleary. On Dart-Zobel algorithm for testing regular type inclusion. 
Technical report, Department of Computer Science, The University of Waikato,
October 1998. [http://xxx.lanl.gov/ps/cs/981000j . 

25. P. Mishra. Towards a theory of types in Prolog. In Proceedings of the IEEE
International Symposium on Logic Programming, pages 289-298. The IEEE Computer
Society Press, 1984. 

26. A. Mycroft and R.A. O'Keefe. A polymorphic type system for Prolog. Artificial 
Intelligence, 23:295-307, 1984. 

27. Frank Pfenning, editor. Types in logic programming. The MIT Press, Cambridge, 
Massachusetts, 1992. 

28. U.S. Reddy. Types for logic programs. In S. Debray and M. Hermenegildo, editors, 
Logic Programming. Proceedings of the 1990 North American Conference, pages 
836-40. The MIT Press, 1990. 

29. M. Solomon. Type definitions with parameters. In Conference Record of the Fifth
ACM Symposium on Principles of Programming Languages, pages 31-38, 1978. 

30. J. Tiuryn. Type inference problems: A survey. In B. Rovan, editor, Proceedings of
the Fifteenth International Symposium on Mathematical Foundations of Computer
Science, pages 105-120. Springer-Verlag, 1990.

31. E. Yardeni, T. Fruehwirth, and E. Shapiro. Polymorphically typed logic programs. 
In K. Furukawa, editor, Logic Programming. Proceedings of the Eighth International
Conference, pages 379-93. The MIT Press, 1991. 

32. E. Yardeni and E. Shapiro. A type system for logic programs. Journal of Logic 
Programming, 10(2):125-153, 1991. 

33. J. Zobel. Derivation of polymorphic types for Prolog programs. In J.-L. Lassez,
editor, Logic Programming: Proceedings of the Fourth International Conference,
pages 817-838. The MIT Press, 1987.



Appendix 

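Lemma 3. Let $C$ be a conjunctive type expression. Then $empty(C)$ iff

$$\forall f \in \bigcap_{\alpha \in pos(C)} F(\alpha).\ empty\Big(\big(\bigcap_{\omega \in pos(C)} (\omega_\Delta^f)\big) \cap \big(\bigcap_{\tau \in neg(C)} \sim(\tau_\Delta^f)\big)\Big)$$

Proof. Let $t$ be a sequence of terms and $f$ a function symbol. By
the definition of $[\cdot]_\Delta$, $f(t) \in [C]_\Delta$ iff
$f \in \bigcap_{\alpha \in pos(C)} F(\alpha)$ and

$$t \in \Big[\bigcap_{\omega \in pos(C)} (\omega_\Delta^f)\Big]_\Delta \setminus \Big[\bigcup_{\tau \in neg(C)} (\tau_\Delta^f)\Big]_\Delta .$$

Moreover, $t$ belongs to this set difference iff

$$t \in \Big[\big(\bigcap_{\omega \in pos(C)} (\omega_\Delta^f)\big) \cap \big(\bigcap_{\tau \in neg(C)} \sim(\tau_\Delta^f)\big)\Big]_\Delta .$$

Thus, $empty(C)$ iff
$empty\big((\bigcap_{\omega \in pos(C)} (\omega_\Delta^f)) \cap (\bigcap_{\tau \in neg(C)} \sim(\tau_\Delta^f))\big)$
for each $f \in \bigcap_{\alpha \in pos(C)} F(\alpha)$.

□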

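Lemma 4. Let $\Gamma$ be a conjunctive sequence expression. Then

$$empty(\Gamma) \text{ iff } \exists 1 \le j \le \|\Gamma\|.\ empty(\Gamma{\downarrow}j)$$

Proof. Let $\|\Gamma\| = n$ and $\Gamma = \gamma_1 \cap \gamma_2 \cap \cdots \cap \gamma_m$
with $\gamma_i = \langle \gamma_{i,1}, \gamma_{i,2}, \ldots, \gamma_{i,n} \rangle$.
We have $[\Gamma]_\Delta = \bigcap_{1 \le i \le m} [\gamma_i]_\Delta$ and
$\Gamma{\downarrow}j = \gamma_{1,j} \cap \gamma_{2,j} \cap \cdots \cap \gamma_{m,j}$.
Hence $\exists 1 \le j \le n.\ empty(\Gamma{\downarrow}j)$ iff
$\exists 1 \le j \le n.\ \bigcap_{1 \le i \le m} [\gamma_{i,j}]_\Delta = \emptyset$ iff
$[\Gamma]_\Delta = \emptyset$ iff $empty(\Gamma)$.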
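The component-wise argument of Lemma 4 can be checked on small finite examples with the following sketch. Finite Python sets stand in for the denotations of type expressions, which in the paper may be infinite; the function names are hypothetical.

```python
from itertools import product


def denote(seq):
    """Set of tuples denoted by a sequence of (finite) component sets."""
    return set(product(*seq))


def conj_denotation(gamma):
    """Denotation of a conjunction (intersection) of sequences."""
    result = denote(gamma[0])
    for seq in gamma[1:]:
        result &= denote(seq)
    return result


def component(gamma, j):
    """Intersection of the j-th components (0-based) of all sequences."""
    result = set(gamma[0][j])
    for seq in gamma[1:]:
        result &= set(seq[j])
    return result


def empty_by_components(gamma):
    """Right-hand side of Lemma 4: some component intersection is empty."""
    return any(not component(gamma, j) for j in range(len(gamma[0])))


overlapping = [({1, 2}, {3}), ({2}, {3, 4})]
disjoint = [({1}, {3}), ({2}, {3})]
for gamma in (overlapping, disjoint):
    # Both sides of the lemma agree on each example.
    print(not conj_denotation(gamma), empty_by_components(gamma))
```

The first conjunction denotes the single tuple (2, 3), and neither component intersection is empty. In the second, the first components {1} and {2} are disjoint, so the whole conjunction is empty.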



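Lemma 5. $[\mathcal{M}]_\Delta$ is a regular term language for any type
expression $\mathcal{M}$.

Proof. The proof is done by constructing a regular term grammar
for $\mathcal{M}$. We first consider the case $\mathcal{M} \in T(\Pi \cup \{1, 0\})$. Let
$R = \langle \mathrm{RTA}(\mathcal{M}), \Sigma, \emptyset, T, \mathcal{M} \rangle$ with

$$T = \{(\alpha \to f(\alpha_1, \ldots, \alpha_k)) \in ground(\Delta) \mid \alpha \in \mathrm{RTA}(\mathcal{M})\}$$

$R$ is a regular term grammar. It now suffices to prove that
$t \in [\mathcal{M}]_\Delta$ iff $\mathcal{M} \Rightarrow_R^* t$.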

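– Sufficiency. Assume $\mathcal{M} \Rightarrow_R^* t$. The proof is done by induction on
the number of derivation steps in $\mathcal{M} \Rightarrow_R^* t$.
• Basis. $\mathcal{M} \Rightarrow_R t$. Then $t$ must be a constant and $\mathcal{M} \to t$ is in $T$,
which implies that $\mathcal{M} \to t$ is in $ground(\Delta)$. By the definition of
$[\cdot]_\Delta$, $t \in [\mathcal{M}]_\Delta$.
• Induction. Suppose $\mathcal{M} \Rightarrow_R f(\mathcal{M}_1, \ldots, \mathcal{M}_k) \Rightarrow_R^{n-1} t$. Then
$t = f(t_1, \ldots, t_k)$ and $\mathcal{M}_i \Rightarrow_R^{n_i} t_i$ with $n_i \le (n-1)$. By the
induction hypothesis, $t_i \in [\mathcal{M}_i]_\Delta$ and hence $t \in [\mathcal{M}]_\Delta$ by the
definition of $[\cdot]_\Delta$.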

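– Necessity. Assume $t \in [\mathcal{M}]_\Delta$. The proof is done by induction on the
height of $t$, denoted $height(t)$.
• $height(t) = 0$ implies that $t$ is a constant. $t \in [\mathcal{M}]_\Delta$ implies
that $\mathcal{M} \to t$ is in $ground(\Delta)$ and hence $\mathcal{M} \to t$ is in $T$.
Therefore, $\mathcal{M} \Rightarrow_R t$.
• Let $height(t) = n$. Then $t = f(t_1, \ldots, t_k)$. $t \in [\mathcal{M}]_\Delta$ implies
that $(\mathcal{M} \to f(\mathcal{M}_1, \ldots, \mathcal{M}_k)) \in ground(\Delta)$ and $t_i \in [\mathcal{M}_i]_\Delta$.
By the definition of $T$, we have $(\mathcal{M} \to f(\mathcal{M}_1, \ldots, \mathcal{M}_k)) \in T$.
By the definition of $\mathrm{RTA}(\cdot)$, we have $\mathcal{M}_i \in \mathrm{RTA}(\mathcal{M})$.
By the induction hypothesis, $\mathcal{M}_i \Rightarrow_R^* t_i$. Therefore,
$\mathcal{M} \Rightarrow_R f(\mathcal{M}_1, \ldots, \mathcal{M}_k) \Rightarrow_R^* f(t_1, \ldots, t_k) = t$.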

Now consider the case $\mathcal{M} \in T(\Pi \cup \{\cap, \cup, \sim, 1, 0\})$. We complete
the proof by induction on the height of $\mathcal{M}$.

– $height(\mathcal{M}) = 0$. Then $\mathcal{M}$ does not contain set operators. We have
already proved that $[\mathcal{M}]_\Delta$ is a regular term language.

– Now suppose $height(\mathcal{M}) = n$. If $\mathcal{M}$ does not contain set operators
then the lemma has already been proved. If the principal type
constructor is one of the set operators then the result follows
immediately as regular term languages are closed under union,
intersection and complement operators [14], [15], [8]. It now suffices
to prove the case $\mathcal{M} = c(\mathcal{M}_1, \ldots, \mathcal{M}_l)$ with $c \in \Pi$. Let
$\mathcal{N} = c(X_1, \ldots, X_l)$ where each $X_j$ is a different new type
constructor of arity 0. Let $\Pi' = \Pi \cup \{X_1, \ldots, X_l\}$,
$\Sigma' = \Sigma \cup \{x_1, \ldots, x_l\}$ and
$\Delta' = \Delta \cup \{X_j \to x_j \mid 1 \le j \le l\}$. $[\mathcal{N}]_{\Delta'}$ is a regular
term language on $\Sigma \cup \{x_1, \ldots, x_l\}$ because $\mathcal{N}$ does not contain
set operators. By the induction hypothesis, $[\mathcal{M}_j]_\Delta$ is a regular
term language. By the definition of $[\cdot]$, we have

$$[\mathcal{M}]_\Delta = [\mathcal{N}]_{\Delta'}[x_1 := [\mathcal{M}_1]_\Delta, \ldots, x_l := [\mathcal{M}_l]_\Delta]$$

which is a regular term language [14], [15], [8]. Here
$S[y_1 := S_{y_1}, \ldots]$ is the set of terms each of which is obtained
from a term in $S$ by replacing each occurrence of $y_j$ with a
(possibly different) term from $S_{y_j}$. This completes the induction
and the proof.



The proof also indicates that a non-deterministic frontier-to-root
tree automaton that recognises $[\mathcal{M}]_\Delta$ has $|\mathrm{RTA}(\mathcal{M})|$ states and that
a deterministic frontier-to-root tree automaton that recognises $[\mathcal{M}]_\Delta$
has $O(2^{|\mathrm{RTA}(\mathcal{M})|})$ states.



